# Quick Start - From Python

Here we present a broad overview of using the PyBigDFT library to drive BigDFT calculations using Python. If you have installed from source, you should make sure you have setup the proper environment variables using the following command:

source install/bin/bigdftvars.sh


## System Manipulation

Here we define a system which is compsed of two fragments: H2 and Helium.

[1]:

import warnings
warnings.filterwarnings("ignore")

[2]:

from BigDFT.Systems import System
from BigDFT.Fragments import Fragment
from BigDFT.Atoms import Atom
from BigDFT.Visualization import InlineVisualizer

[3]:

# Create Three Atoms
at1 = Atom({"r": [0, 0, 0], "sym": "H", "units": "bohr"})
at2 = Atom({"r": [0, 0, 1.4], "sym": "H", "units": "bohr"})
at3 = Atom({"r": [10, 0, 0], "sym": "He", "units": "bohr"})

# Construct a System from Two Fragments (H2, He)
sys = System()
sys["H2:1"] = Fragment([at1, at2])
sys["He:2"] = Fragment([at3])

# Iterate Over The System
for fragid, frag in sys.items():
for at in frag:
print(fragid, at.sym, at.get_position())

H2:1 H [0.0, 0.0, 0.0]
H2:1 H [0.0, 0.0, 1.4]
He:2 He [10.0, 0.0, 0.0]

[4]:

# NBVAL_IGNORE_OUTPUT
viz = InlineVisualizer(400, 300)
viz.display_system(sys)


You appear to be running in JupyterLab (or JavaScript failed to load for some other reason). You need to install the 3dmol extension:
jupyter labextension install jupyterlab_3dmol

## Calculation

Calculate the created system using a grid spacing of $$0.4$$ and the PBE functional. A logfile is generated from which we can access the computed properties. This logfile has built in properties and can be accessed like a dictionary.

[5]:

from BigDFT.Inputfiles import Inputfile
inp = Inputfile()
inp.set_hgrid(0.4)
inp.set_xc("PBE")
inp["perf"] = {"calculate_forces": False,
"multipole_preserving": True}

[6]:

from BigDFT.Calculators import SystemCalculator
calc = SystemCalculator(skip=True, verbose=False)

[7]:

log = calc.run(sys=sys, input=inp, name="quick", run_dir="scratch")

[8]:

print(log.energy)
print(log.log["Memory Consumption Report"]
["Memory occupation"])

-4.053822554528733
{'Peak Value (MB)': 123.207, 'for the array': 'f_i', 'in the routine': 'vxcpostprocessing', 'Memory Peak of process': 'unknown'}


## Periodic Systems

We setup a BCC unit cell of iron and perform the calculation using a 2x2x2 k-point grid with a Monkhorst-Pack grid.

[9]:

from BigDFT.UnitCells import UnitCell

[10]:

pat = Atom({"Fe": [0, 0, 0], "units": "angstroem"})
psys = System({"CEL:0": Fragment([pat])})
psys.cell = UnitCell([2.867, 2.867, 2.867], units="angstroem")

[11]:

# NBVAL_IGNORE_OUTPUT
viz = InlineVisualizer(400, 300)
viz.display_system(psys)


You appear to be running in JupyterLab (or JavaScript failed to load for some other reason). You need to install the 3dmol extension:
jupyter labextension install jupyterlab_3dmol

[12]:

inp = Inputfile()
inp.set_hgrid(0.3)
inp.set_xc("LDA")
inp["kpt"] = {"method": "mpgrid", "ngkpt": [2, 2, 2]}

[13]:

log = calc.run(sys=psys, input=inp, name="psys", run_dir="scratch")

[14]:

# NBVAL_IGNORE_OUTPUT
_ = log.get_dos().plot()


## File I/O

Read and write a PDB file.

[15]:

from BigDFT.IO import read_pdb, write_pdb

[16]:

with open("scratch/temp.pdb", "w") as ofile:
write_pdb(sys, ofile)
with open("scratch/temp.pdb", "r") as ifile:


## Linear Scaling

Activate the linear scaling mode of BigDFT and compute some per fragment quantities.

[17]:

from BigDFT.PostProcessing import BigDFTool
from io import StringIO
from matplotlib import pyplot as plt
from BigDFT.Systems import plot_fragment_information

[18]:

inp = Inputfile()
inp.set_hgrid(0.55)
inp.set_psp_nlcc()  # Soft pseudopotentials
inp.set_xc("PBE")
inp["import"] = "linear"

[19]:

istr = """@<TRIPOS>MOLECULE
N3
48 0 0 0 0
SMALL
GASTEIGER

@<TRIPOS>ATOM
1  N        -15.9520   11.4820   75.1020 N.4     1  ALA         0.3855
2  H        -16.8590   11.1760   75.4220 H       1  ALA        -0.0890
3  H2       -15.9190   12.4910   75.1350 H       1  ALA        -0.0890
4  H3       -15.3270   11.2070   75.8460 H       1  ALA        -0.0890
5  CA       -15.6750   10.9380   73.6710 C.3     1  ALA        -0.0080
6  HA       -15.5000    9.8640   73.7340 H       1  ALA         0.0175
7  CB       -16.9370   11.1850   72.9780 C.3     1  ALA        -0.0702
8  HB1      -16.8760   10.8330   71.9480 H       1  ALA         0.0218
9  HB2      -17.7950   10.7240   73.4660 H       1  ALA         0.0218
10  HB3      -17.0660   12.2620   72.8690 H       1  ALA         0.0218
11  C        -14.3150   11.5090   73.1160 C.2     1  ALA         0.1989
12  O        -13.5240   11.9470   73.9260 O.2     1  ALA        -0.2776
13  N        -14.0770   11.2840   71.7960 N.am    2  VAL        -0.3051
14  H        -14.7740   10.8610   71.1990 H       2  VAL         0.1494
15  CA       -12.9310   11.8890   71.0690 C.3     2  VAL         0.1016
16  HA       -12.1370   12.1190   71.7800 H       2  VAL         0.0597
17  CB       -12.2680   10.8300   70.1710 C.3     2  VAL        -0.0197
18  HB       -13.0570   10.2210   69.7300 H       2  VAL         0.0318
19  CG1      -11.2730   11.3540   69.0870 C.3     2  VAL        -0.0605
20 HG11      -10.7560   10.5590   68.5490 H       2  VAL         0.0233
21 HG12      -11.7530   11.9770   68.3320 H       2  VAL         0.0233
22 HG13      -10.5100   11.8940   69.6470 H       2  VAL         0.0233
23  CG2      -11.4480    9.8090   70.9840 C.3     2  VAL        -0.0605
24 HG21      -11.9590    9.4330   71.8700 H       2  VAL         0.0233
25 HG22      -11.3140    8.9080   70.3860 H       2  VAL         0.0233
26 HG23      -10.4970   10.2560   71.2750 H       2  VAL         0.0233
27  C        -13.2980   13.1490   70.4060 C.2     2  VAL         0.2342
28  O        -14.1940   13.1280   69.5820 O.2     2  VAL        -0.2738
29  N        -12.5920   14.2640   70.6930 N.am    3  LEU        -0.3007
30  H        -11.8060   14.2210   71.3260 H       3  LEU         0.1496
31  CA       -12.8570   15.7250   70.2250 C.3     3  LEU         0.1249
32  HA       -13.7770   15.8070   69.6470 H       3  LEU         0.0616
33  CB       -12.9890   16.4610   71.5050 C.3     3  LEU        -0.0216
34  HB2      -12.3120   16.1130   72.2850 H       3  LEU         0.0291
35  HB3      -12.7420   17.5000   71.2880 H       3  LEU         0.0291
36  CG       -14.4560   16.4840   72.0620 C.3     3  LEU        -0.0445
37  HG       -15.0910   16.5870   71.1830 H       3  LEU         0.0296
38  CD1      -14.9030   15.1850   72.7940 C.3     3  LEU        -0.0626
39 HD11      -15.9610   15.3610   72.9910 H       3  LEU         0.0232
40 HD12      -14.8720   14.2770   72.1920 H       3  LEU         0.0232
41 HD13      -14.2220   15.0500   73.6340 H       3  LEU         0.0232
42  CD2      -14.5970   17.6630   72.9710 C.3     3  LEU        -0.0626
43 HD21      -15.6450   17.5910   73.2620 H       3  LEU         0.0232
44 HD22      -13.9320   17.5610   73.8280 H       3  LEU         0.0232
45 HD23      -14.3580   18.5620   72.4040 H       3  LEU         0.0232
46  C        -11.7070   16.2810   69.3360 C.2     3  LEU         0.3758
47  O        -10.5210   15.8410   69.5920 O.co2   3  LEU        -0.2443
48  OXT      -11.9460   17.0960   68.3940 O.co2   3  LEU        -0.2443
"""

[20]:

log = calc.run(sys=Lsys, input=inp, name="Lsys", run_dir="scratch")

[21]:

# NBVAL_IGNORE_OUTPUT
fig, axs = plt.subplots(2, 1)
btool = BigDFTool()
plot_fragment_information(axs[0], btool.fragment_population(Lsys, log))
axs[0].set_ylabel("Charge", fontsize=12)
axs[0].set_xticks([])
axs[0].set_xlabel("")
plot_fragment_information(axs[1], btool.run_compute_purity(Lsys, log))
axs[1].set_ylabel("Purity", fontsize=12)
fig.tight_layout()


## Remote Calculations

Calculations of large systems require the use of powerful supercomputers. One way to run on a supercomputer is to write a python script (or convert a Jupyter notebook), copy the script and related data to the machine, and run the calculation through the job system. As an alternative, we have developed the remotemanager framework for automatically launching your jobs. It works by wrapping arbitrary python functions that are then sent and run on the remote machine.

[22]:

def calculate(hgrid):
from BigDFT.Calculators import SystemCalculator
from BigDFT.Inputfiles import Inputfile
from BigDFT.Database.Molecules import get_molecule

sys = get_molecule("N2")
inp = Inputfile()
inp.set_hgrid(hgrid)
calc = SystemCalculator()
log = calc.run(sys=sys, input=inp, name=str(hgrid))

return log.energy

[23]:

from remotemanager import URL  # Replace this with the computer of your choice
from remotemanager import Dataset  # Store a set of remote calculations
url = URL()
ds = Dataset(function=calculate, url=url)
for h in [0.3, 0.35, 0.4]:
ds.append_run({"hgrid": h})
ds.run()

appended run runner-0
appended run runner-1
appended run runner-2
assessing run for runner dataset-54fed8e7-runner-0... checks passed, running
assessing run for runner dataset-54fed8e7-runner-1... checks passed, running
assessing run for runner dataset-54fed8e7-runner-2... checks passed, running

[24]:

from time import sleep
while not all(ds.is_finished):
sleep(10)
ds.fetch_results()
print(ds.results)

checking remotely for finished runs
checking remotely for finished runs
[-19.888866344196096, -19.888459196244888, -19.88727486068828]


For simpler runs, you can use Jupyter magics for the same effect.

[25]:

%load_ext remotemanager

[26]:

%%sanzu url=url
%%sargs hgrid = 0.35
from BigDFT.Logfiles import Logfile
log = Logfile("log-" + str(hgrid) + ".yaml")
log.log["Memory Consumption Report"]["Memory occupation"]

runner runner-1 already exists
assessing run for runner dataset-469c350d-runner-0... skipping already completed run
checking remotely for finished runs

[27]:

print(magic_dataset.results)

[{'Peak Value (MB)': 101.461, 'for the array': 'zt', 'in the routine': 'G_PoissonSolver', 'Memory Peak of process': 'unknown'}]